Improving Word Sense Disambiguation with Linguistic Knowledge from a Sense Annotated Treebank
نویسندگان
چکیده
In this paper we present an approach for the enrichment of WSD knowledge bases with data-driven relations from a gold standard corpus (annotated with word senses, valency information, syntactic analyses, etc.). We focus on Bulgarian as a use case, but our approach is scalable to other languages as well. For the purpose of exploring such methods, the Personalized Page Rank algorithm was used. The reported results show that the addition of new knowledge improves the accuracy of WSD with approximately 10.5%.
منابع مشابه
Improving Parsing and PP Attachment Performance with Sense Information
To date, parsers have made limited use of semantic information, but there is evidence to suggest that semantic features can enhance parse disambiguation. This paper shows that semantic classes help to obtain significant improvement in both parsing and PP attachment tasks. We devise a gold-standard senseand parse tree-annotated dataset based on the intersection of the Penn Treebank and SemCor, a...
متن کاملBilingual English-Czech Valency Lexicon Linked to a Parallel Corpus
This paper presents a resource and the associated annotation process used in a project of interlinking Czech and English verbal translational equivalents based on a parallel, richly annotated dependency treebank containing also valency and semantic roles, namely the Prague Czech-English Dependency Treebank. One of the main aims of this project is to create a high-quality and relatively large em...
متن کاملWord Sense vs. Word Domain Disambiguation: A Maximum Entropy Approach
In this paper, a supervised learning system of word sense disambiguation is presented. It is based on conditional maximum entropy models. This system acquires the linguistic knowledge from an annotated corpus and this knowledge is represented in the form of features. The system were evaluated both using WordNet’s senses and domains as the sets of classes of each word. Domain labels are obtained...
متن کاملImproving Dependency Parsing with Semantic Classes
This paper presents the introduction of WordNet semantic classes in a dependency parser, obtaining improvements on the full Penn Treebank for the first time. We tried different combinations of some basic semantic classes and word sense disambiguation algorithms. Our experiments show that selecting the adequate combination of semantic features on development data is key for success. Given the ba...
متن کاملExploiting Semantic Role Resources for Preposition Disambiguation
This article describes how semantic role resources can be exploited for preposition disambiguation. The main resources include the semantic role annotations provided by the Penn Treebank and FrameNet tagged corpora. The resources also include the assertions contained in the Factotum knowledge base, as well as information from Cyc and Conceptual Graphs. A common inventory is derived from these i...
متن کامل